## This repository contains the data and code used in "How alt-tech users evaluate search engines: Cause-advancing audits"

The repository is organized as follows:

    main/
    ├── data/
    │   ├── annotations/
    │   │   ├── GPT_Annotations/
    │   │   │   ├── annotationsv1/
    │   │   │   ├── annotationsv2/
    │   │   │   └── annotationsv3/
    │   │   └── human_annotations/
    │   ├── processed/
    │   └── raw/
    ├── src/
    └── results/


## DATA
`search_engine_list.csv` contains a list of the search engines we considered (modified from Wikipedia).


`data` contains three directories:

* `raw`: the raw data extracted from /pol/ (see `src/01_4plebs_scraper.R`).
* `processed`: the union of that data following minor cleaning (see `src/02_unify_4plebs_data.py`).
* `annotations`: `annotations/human_annotations` contains labels assigned independently by two human annotators to a sample of the 4chan /pol/ comments; `annotations/GPT_Annotations` contains the GPT-4 labeling outputs for the three prompts described in the paper and appendix (see `prompts.txt`).
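
To illustrate what "the union of that data" could look like in practice, here is a minimal sketch that uses only the Python standard library. It is not the code in `src/02_unify_4plebs_data.py`. It assumes the raw files are CSVs sharing one header, and it treats dropping exact duplicate rows as the "minor cleaning" step. The function name `unify_csv_files` is hypothetical.

```python
import csv
from pathlib import Path


def unify_csv_files(raw_dir, out_path):
    """Concatenate every CSV in raw_dir into one file, dropping exact duplicate rows.

    Assumption: all files share the same header row. Returns the number of
    data rows written.
    """
    header = None
    seen = set()
    rows = []
    for path in sorted(Path(raw_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path.name} has a different header")
            for row in reader:
                key = tuple(row)
                if key not in seen:  # keep only the first copy of each row
                    seen.add(key)
                    rows.append(row)
    if header is None:
        raise ValueError("no non-empty CSV files found")
    with Path(out_path).open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Rows keep the order in which they are first seen, with files processed in sorted filename order, so repeated runs over the same inputs produce the same output.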

## SRC

The scripts listed below are executed sequentially to generate the results. We only provide scripts for prompt v1, but each can be run for subsequent prompts by replacing every instance of `annotationsv1` with `annotationsv2` or `annotationsv3`. If replicating from scratch, the user will need to:

* update the OpenAI API key in `03_gpt_api.py`
* update the `clean_and_save_response` function to address any new GPT-4 formatting errors
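
The prompt-version substitution described above can be done by hand in an editor. As a rough sketch, it could also be scripted as shown here. The helper name `retarget_prompt_version` is hypothetical and is not part of this repository; it only performs the text replacement and does not run any script.

```python
def retarget_prompt_version(script_text, version):
    """Return script_text with every 'annotationsv1' replaced by 'annotationsv<version>'.

    Assumption: the prompt version appears in the scripts only as the
    literal directory name 'annotationsv1'.
    """
    if version not in (1, 2, 3):
        raise ValueError("version must be 1, 2, or 3")
    return script_text.replace("annotationsv1", f"annotationsv{version}")


# Example with an illustrative line of script text (not taken from the repository).
line = 'out_dir = "data/annotations/GPT_Annotations/annotationsv1"'
print(retarget_prompt_version(line, 2))
```

Writing the result to a copy of the script, rather than editing in place, keeps the v1 version intact for comparison.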

* `01_4plebs_scraper.R`: pulls mentions of search engines from 4plebs
* `02_unify_4plebs_data.py`: joins the mentions of search engines pulled from 4plebs and applies minor cleaning
* `03_gpt_api.py`: sends the data to GPT-4o-mini for annotation
* `04_generate_results.R`: generates the visualizations and the table in the paper
* `05_annotator_agreement.py`: evaluates annotator agreement
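
For readers unfamiliar with agreement measures, the sketch below computes Cohen's kappa for two annotators in plain Python. This is an assumption for illustration only: this README does not state which agreement statistic `05_annotator_agreement.py` uses, and the labels in the example are made up.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same number of items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one labeled item is required")
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels for four comments.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

Here the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.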



## Citation

```
@article{williams2024causeaudit,
  title={How alt-tech users evaluate search engines: Cause-advancing audits},
  author={Williams, Evan M and Carley, Kathleen M},
  journal={Harvard Kennedy School Misinformation Review},
  year={2025}
}
```



